Effective Decomposing Approach for Historical XML Documents

Authors

  • Ming Shien Cheng
  • PingYu Hsu
  • MinTzu Wang
Abstract

XML has recently become widely used as the de facto standard for data representation and exchange on the Internet. In 2006, office application suites such as OpenOffice.org and Microsoft Office both adopted XML as their main data storage format. Successive versions of a historical XML document often differ only slightly, yet each version is stored in its own independent space, so storing historical office documents efficiently has become a growing concern. This paper introduces an efficient way to decompose multi-version XML documents and store them effectively for later retrieval. The approach is characterized both by its efficient use of storage space and by its preservation of the integrity of the original documents. It minimizes changes to data content and structure when historical XML documents are transformed. For enterprises, the approach supports proper management of electronic documents, and all information in a document is preserved for reuse.

Similar resources

Managing Multiversion Documents & Historical Databases: a Unified Solution Based on XML

XML can provide a very effective environment for the preservation of digital information whereby historical information can be easily preserved and searched through powerful historical queries. We propose a unified approach to represent multiversion XML documents and transaction-time databases in XML, and show that temporal queries can then be expressed in standard XQuery. In our demo we demons...

XML and Knowledge Technologies for Semantic-Based Indexing of Paper Documents

Effective daily processing of large amounts of paper documents in office environments requires the application of semantic-based indexing techniques during the transformation of paper documents to electronic format. For this purpose a combination of both XML and knowledge technologies can be used. XML distinguishes between data, its structure and semantics, allowing the exchange of data element...

Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity

Given the growing number of XML documents, organizing them effectively in order to retrieve useful information from them is essential. One possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...

Temporal queries and version management in XML-based document archives

By storing the successive versions of a document in an incremental fashion, XML repositories and data warehouses achieve: (i) the efficient preservation of critical information and (ii) the ability to support historical queries on the evolution of documents and their contents. In this paper, we present efficient techniques for managing multi-version document histories and supporting powerful te...

Mining Frequently Changing Substructures from Historical Unordered XML Documents

Recently, research efforts in XML data mining have been increasing. These efforts have largely assumed that XML documents are static. However, in many real applications, XML data are evolutionary in nature. In this paper, we focus on mining evolution patterns from historical XML documents. Specifically, we propose a novel approach to discover frequently changing structures (FCS) from a sequence of...

Publication date: 2012